An Effective Cost-Sensitive XGBoost Method for Malicious URLs Detection in Imbalanced Dataset

Abstract

Class imbalance is a common problem in the modeling process and has attracted growing attention from scholars. If the imbalance ratio between the number of positive and negative labels is ignored, the result is a biased classifier whose performance on minority classes is limited. The synthetic minority over-sampling technique (SMOTE) is a classic and popular method that is widely used to address this problem. However, SMOTE increases label noise and training time during the over-sampling process. To improve the detection rate for minority classes while ensuring efficiency, we propose a cost-sensitive XGBoost (CS-XGB) method for imbalanced data. The CS-XGB method can reduce the preference for the majority classes without changing the distribution of the original data. A total of 600,000 Uniform Resource Locators (URLs) were collected to validate the method. We compare XGBoost (XGB), SMOTE+XGB, and CS-XGB, and the experimental results confirm that CS-XGB is robust and efficient in most cases.
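The abstract does not give the exact cost formulation, so the following is only a minimal sketch of one common way to make a boosted classifier cost-sensitive: weighting the logistic loss per class and passing the resulting gradient and Hessian to the booster. The function names, the `pos_cost`/`neg_cost` parameters, and the assumption that the minority (malicious) class is labelled 1 are illustrative and not taken from the paper.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def imbalance_ratio(labels):
    """Ratio of negative to positive samples (assumes at least one positive)."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    return neg / pos


def cost_sensitive_logloss(margins, labels, pos_cost=1.0, neg_cost=1.0):
    """Per-sample gradient and Hessian of a class-weighted logistic loss.

    Loss per sample: -w_y * [y*log(p) + (1-y)*log(1-p)], with p = sigmoid(margin).
    Hypothetical helper: the cost values are assumptions, not the paper's.
    """
    grads, hesses = [], []
    for z, y in zip(margins, labels):
        p = sigmoid(z)
        w = pos_cost if y == 1 else neg_cost
        grads.append(w * (p - y))          # first derivative w.r.t. the margin
        hesses.append(w * p * (1.0 - p))   # second derivative w.r.t. the margin
    return grads, hesses


# Toy data: one malicious URL (label 1) among four benign ones (label 0).
labels = [1, 0, 0, 0, 0]
margins = [0.0] * len(labels)              # untrained model: p = 0.5 everywhere
ratio = imbalance_ratio(labels)
grads, hesses = cost_sensitive_logloss(margins, labels, pos_cost=ratio)
```

With the positive cost set to the imbalance ratio, the single minority sample pulls on the model as strongly as all majority samples combined, and the training data itself is left unchanged. In the XGBoost library, a comparable effect is available through the `scale_pos_weight` parameter; whether CS-XGB uses that mechanism or a different cost matrix is not stated in the abstract.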

Similar resources

Cost-sensitive decision tree ensembles for effective imbalanced classification

Real-life datasets are often imbalanced, that is, there are significantly more training samples available for some classes than for others, and consequently the conventional aim of maximizing overall classification accuracy is not appropriate when dealing with such problems. Various approaches have been introduced in the literature to deal with imbalanced datasets, and are typically based on over...

A cost effective and sensitive method for the determination of ammonia concentration in nanocrystal mordenite

The reduction capacity of ammonia, even when present at ppm level, can be demonstrated by the increased encounter probability between ammonia and methylene blue dye (MB+) incorporated in nanomordenite and Na-MOR zeolites. The rate of reduction of methylene blue dye by ammonia on the surface of nanomordenite zeolite is faster than on Na-mordenite (Na-MOR) zeolite. Because nanomordenite zeolite with high si...

A Cost-Sensitive Ensemble Method for Class-Imbalanced Datasets

... costs for the positive and negative classes, SVM can be extended to the cost-sensitive setting by introducing an additional parameter that penalizes the errors asymmetrically. Consider that we have a binary classification problem, which is represented by a data set $\{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\}$, where $x_i \in \mathbb{R}^k$ represents a k-dimensional data point and ...

Cost-Sensitive Detection of Malicious Applications in Mobile Devices

Mobile phones have become a primary communication device nowadays. In order to maintain proper functionality, various existing security solutions are being integrated into mobile devices. Some of the more sophisticated solutions, such as host-based intrusion detection systems (HIDS) are based on continuously monitoring many parameters in the device such as CPU and memory consumption. Since the ...

Journal

Journal title: IEEE Access

Year: 2021

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2021.3093094